Inspecting operations of a machine to detect elephant flows

ABSTRACT

Some embodiments provide a system that detects whether a data flow is an elephant flow; and if so, the system treats it differently than a mouse flow. The system of some embodiments detects an elephant flow by examining, among other items, the operations of a machine. In detecting, the system identifies an initiation of a new data flow associated with the machine. The new data flow can be an outbound data flow or an inbound data flow. The system then determines, based on the amount of data being sent or received, if the data flow is an elephant flow. The system of some embodiments identifies the initiation of a new data flow by intercepting a socket call or request to transfer a file.

CLAIM OF BENEFIT TO PRIOR APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 15/972,232, filed May 7, 2018, now published as U.S. Patent Publication 2018/0331961. U.S. patent application Ser. No. 15/972,232 is a continuation application of U.S. patent application Ser. No. 14/502,102, filed Sep. 30, 2014, now issued as U.S. Pat. No. 9,967,199. U.S. patent application Ser. No. 14/502,102 claims the benefit of U.S. Provisional Patent Application 61/913,899, filed Dec. 9, 2013, U.S. Provisional Patent Application 61/973,255, filed Mar. 31, 2014, and U.S. Provisional Patent Application 62/010,944, filed Jun. 11, 2014. U.S. patent application Ser. No. 15/972,232, now published as U.S. Patent Publication 2018/0331961, U.S. patent application Ser. No. 14/502,102, now issued as U.S. Pat. No. 9,967,199, and U.S. Provisional Patent Applications 61/913,899, 61/973,255, and 62/010,944 are incorporated herein by reference.

BACKGROUND

Today, a datacenter may process different types of flows, including elephant flows and mouse flows. An elephant flow represents a long-lived flow or a continuous traffic flow that is typically associated with a high-volume connection. Different from an elephant flow, a mouse flow represents a short-lived flow. Mice are often associated with bursty, latency-sensitive applications, whereas elephants tend to be associated with large data transfers in which throughput is far more important than latency.

A problem with elephant flows is that they tend to fill network buffers end-to-end, and this introduces non-trivial queuing delay to anything that shares these buffers. For instance, a forwarding element may be responsible for managing several queues to forward packets, and several packets belonging to a mouse flow may be stuck in the same queue behind a group of other packets belonging to an elephant flow. In a network of elephants and mice, this means that the more latency-sensitive mice are being affected. Another problem is that mice are generally very bursty, so adaptive routing techniques are not effective with them.

BRIEF SUMMARY

Some embodiments provide a system that detects whether a data flow is an elephant flow; and if so, the system treats it differently than a mouse flow. The system of some embodiments detects an elephant flow by examining, among other items, the operations of a machine. The elephant flow represents a long-lived data flow or a continuous traffic flow that is associated with large data transfer. In some embodiments, the machine is a physical machine or a virtual machine (VM). In detecting, the system uses machine introspection to identify an initiation of a new data flow associated with the machine. The new data flow can be an outbound data flow or an inbound data flow. The system then tracks how much data the machine is sending or receiving through the connection, and determines, based on the amount of data being sent or received, if the data flow is an elephant flow.

Different embodiments use different techniques to identify the initiation of a new flow of data that is associated with the machine. In some embodiments, the system identifies a new data flow by intercepting a network connection that is being opened on the machine. The connection can be an inbound network connection and/or an outbound network connection. In intercepting, the system of some embodiments performs a network introspection operation on the machine to intercept a socket call being made to open the new network connection.

The system of some embodiments identifies the initiation of a new data flow by capturing a request to transfer (e.g., send or receive) a file. That is, rather than through a low-level socket call, the system of some embodiments detects an operating system (OS)/library or application programming interface (API) call to send or receive a file. The call may be associated with a particular network protocol for transferring files from one network host to another network host. Examples of different widely used data transfer protocols include file transfer protocol (FTP), Secure Shell (SSH) file transfer protocol, Bit Torrent, etc.

In some embodiments, the system uses one of several different methods to determine the amount of data that is being transferred in a data flow. For some embodiments that detect elephant flows based on file transfer requests, the system makes this determination based on the size of the file that is being transferred. For instance, the system can identify the size of the file, and if the file is over a threshold size, the system can specify that the associated data flow is an elephant flow.

Instead of identifying a file size, the system of some embodiments tracks the amount of data that has been transferred (e.g., sent or received). In some embodiments, the system tracks the data size associated with every packet transferred in a data flow (e.g., through a network connection). For instance, the system may calculate the amount of data transferred by accumulating or adding the number of bytes transferred with each packet. If the number of bytes is over a threshold value, the system of some embodiments declares the data flow to be an elephant flow. In some embodiments, the system includes a machine that marks packets with a marking, and a packet inspection agent uses the mark to track the amount of data sent and identify an elephant flow if the amount is over a threshold value.

In conjunction with byte count or instead of it, the system of some embodiments factors in time. As an example, the system might detect an elephant flow solely based on how long the data flow has been associated with the machine. That is, if the duration of the data flow is over a set period of time, the system might determine that the data flow is an elephant flow. The duration can be calculated based on how long a network connection has been opened to handle the data flow. Also, instead of byte count, the process might calculate bit rate or bytes per second. The bit rate can be used to allow elephant flows with slow data transfer rate to progress as normal. This is because such elephant flows with slow transfer rate may not be contributing, or at least not contributing significantly, to the latency of other data flows, such as mice flows and non-detected elephant flows.

The system of some embodiments identifies one or more pieces of information that provide context regarding the detected elephant flow. In some embodiments, the context information is used to identify the elephant flow. The context information may include user data, application data, and/or machine data. For instance, the system may identify the name of the source machine, the address (e.g., MAC address, IP address) associated with the source machine, the address associated with a destination machine, a port number (e.g., TCP port number, UDP port number), the application that initiated the call, user data (e.g., username of the person that is logged onto the machine), etc. In some embodiments, the source of the data flow is identified by source MAC address, source IP address, and source port. The destination may also be identified by the same set of tuples or fields.

Once an elephant flow is detected, the system of some embodiments treats the detected elephant flow differently than other flows (e.g., mouse flows, non-detected elephant flows). In some embodiments, the system reports the elephant flow and the associated context information (e.g., MAC address, IP address, etc.) to an agent that is interested in the elephant flow. For instance, the system may send a report to a forwarding element, such as a switch or router. The forwarding element may then use Quality of Service (QOS) configuration to place packets belonging to the elephant flow in a particular queue that is separate from one or more other queues with other packets. In this manner, one set of packets belonging to a mouse flow is not stuck in the same queue behind another set of packets belonging to an elephant flow.

Alternatively, the system may send packets associated with an elephant flow along different paths (e.g., equal-cost multipath routing (ECMP) legs) to break the elephant flow into mice flows. As another example, the system may send elephant flow traffic along a separate physical network, such as an optical network that is more suitable for slow changing, bandwidth-intensive traffic. In some embodiments, the system reports the elephant flow to a network controller (e.g., a software-defined networking controller) that can configure one or more forwarding elements to handle the elephant flow.

Additional techniques for detecting and handling elephant flows are described in U.S. patent application Ser. No. 14/231,647, entitled “Detecting and Handling Elephant Flows”, filed Mar. 31, 2014, now issued as U.S. Pat. No. 10,193,771. Furthermore, several embodiments that detect an elephant flow based on the size of a packet are described in U.S. patent application Ser. No. 14/231,652, entitled “Detecting an Elephant Flow Based on the Size of a Packet”, filed Mar. 31, 2014, now issued as U.S. Pat. No. 9,548,924. Some embodiments that report elephant flows to a network controller are described in U.S. patent application Ser. No. 14/231,654, entitled “Reporting Elephant Flows to a Network Controller”, filed Mar. 31, 2014, now issued as U.S. Pat. No. 10,158,538. These U.S. patent applications are incorporated herein by reference.

The preceding Summary is intended to serve as a brief introduction to some embodiments as described herein. It is not meant to be an introduction or overview of all subject matter disclosed in this document. The Detailed Description that follows and the Drawings that are referred to in the Detailed Description will further describe the embodiments described in the Summary as well as other embodiments. Accordingly, to understand all the embodiments described by this document, a full review of the Summary, Detailed Description and the Drawings is needed. Moreover, the claimed subject matters are not to be limited by the illustrative details in the Summary, Detailed Description and the Drawings, but rather are to be defined by the appended claims, because the claimed subject matters can be embodied in other specific forms without departing from the spirit of the subject matters.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth in the appended claims. However, for purposes of explanation, several embodiments of the invention are set forth in the following figures.

FIG. 1 illustrates an example of inspecting operations of a machine to detect an elephant flow.

FIG. 2 conceptually illustrates a process that some embodiments implement to detect an elephant flow.

FIG. 3 illustrates an example of inspecting operations of a machine to detect an elephant flow.

FIG. 4 illustrates a process that some embodiments implement to intercept a new connection for a data flow and determine whether the data flow is an elephant flow.

FIG. 5 illustrates an example of intercepting a high-level application call to detect an elephant flow.

FIG. 6 illustrates a process that some embodiments implement to intercept a high-level application call to transmit a file on a socket.

FIG. 7 illustrates a process that some embodiments implement to analyze context data relating to an application call to detect an elephant flow.

FIG. 8 illustrates an example of detecting an elephant flow using a combination of machine introspection and packet inspection.

FIG. 9 illustrates a process that some embodiments implement to intercept a new connection and to specify a unique marking for packets sent over the connection.

FIG. 10 illustrates a process that some embodiments implement to perform packet inspection in order to detect an elephant flow.

FIG. 11 conceptually illustrates an electronic system with which some embodiments of the invention are implemented.

DETAILED DESCRIPTION

In the following detailed description of the invention, numerous details, examples, and embodiments of the invention are set forth and described. However, it will be clear and apparent to one skilled in the art that the invention is not limited to the embodiments set forth and that the invention may be practiced without some of the specific details and examples discussed.

Embodiments described herein provide a system that detects whether a data flow is an elephant flow; and if so, the system treats it differently than a mouse flow. The system of some embodiments detects an elephant flow by examining, among other things, the operations of a machine. The elephant flow represents a long-lived data flow or a continuous traffic flow that is associated with large data transfer. In some embodiments, the machine is a physical machine or a virtual machine (VM). In detecting, the system identifies an initiation of a new data flow associated with the machine. The new data flow can be an outbound data flow or an inbound data flow. The system then determines, based on the amount of data being sent or received, if the data flow is an elephant flow.

For some embodiments of the invention, FIG. 1 illustrates an example of inspecting operations of a machine 100 to detect an elephant flow. Specifically, this figure illustrates in four stages 105-120 an example of (1) detecting an initiation of a new data flow, (2) determining the amount of data being sent or received in the data flow, and (3) specifying that the data flow is an elephant flow based on the amount of data. The figure includes the machine 100 that runs the operating system (OS) 140. The OS 140 includes a data flow interception agent 145, an elephant detector 125, and a network stack 130. The figure also includes a forwarding element 135 that forwards packets for the machine 100. The term “packet” is used here as well as throughout this application to refer to a collection of bits in a particular format sent across a network. It should be understood that the term “packet” may be used herein to refer to various formatted collections of bits that may be sent across a network, such as Ethernet frames, TCP segments, UDP datagrams, IP packets, etc.

In the example of FIG. 1, the machine 100 can be a physical machine or a virtual machine (VM). Different from a physical or dedicated machine, the VM runs on a hypervisor of a host machine. The hypervisor can be a type 1 bare-metal hypervisor or a type 2 hosted hypervisor. For instance, the hypervisor can be a bare-metal hypervisor, which serves as a software abstraction layer that runs on top of the machine's hardware and runs below any OS.

In some embodiments, the data flow interception agent 145 is a component of the OS 140 that is responsible for detecting an initiation of a new data flow associated with the machine 100. The data flow can be an inbound data flow, which means that the data will be sent to the machine from another machine or network host. Alternatively, the data flow can be an outbound data flow, which means that the data will be sent from the machine to another machine. In detecting a new data flow, the data flow interception agent 145 may intercept each new network connection that is being initiated or opened on the machine. For instance, the data flow interception agent may perform a network introspection operation to intercept a socket call being made to open a new network connection to handle the data flow.

Alternatively, the data flow interception agent 145 of some embodiments identifies each new data flow based on a request to send or receive a file. That is, rather than through a low-level socket call, the data flow interception agent detects an OS/library or application programming interface (API) call to transfer (e.g., send or receive) a file. The call may be associated with a particular protocol to transfer files from one network host to another network host. Examples of different widely used data transfer protocols include file transfer protocol (FTP), Secure Shell (SSH) file transfer protocol, Bit Torrent, etc.

In some embodiments, the data flow interception agent 145 is a native component of the OS. That is, the data flow interception agent comes preinstalled with the OS. Alternatively, the data flow interception agent 145 may be a component that is installed on the OS. For instance, the data flow interception agent may be a thin agent or a part of a thin agent that is installed on the machine to perform network introspection. The thin agent may operate on the machine to intercept different types of events that are occurring on the machine. For instance, the thin agent may include a network introspection module to intercept each system call to open a new network connection, a file introspection module to intercept each system call to open a data file, etc.

Different from the data flow interception agent 145, the elephant detector 125 is responsible for determining whether the identified data flow is an elephant flow rather than a mouse flow. In determining, the elephant detector 125 of some embodiments identifies that the amount of data being sent or received is over a threshold value. The elephant detector can use one of several different methods to determine the amount of data that is being transferred with a data flow. For some embodiments that detect elephant flows based on file transfer requests, the elephant detector 125 makes this determination based on the size of the file that is being transferred in the data flow. For instance, the elephant detector 125 can identify the size of the file, and if the file is over the threshold value, the elephant detector can specify that the data flow is an elephant flow.
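The file-size check described above can be sketched as follows. This is a minimal illustration only, not a required implementation; the function name and the 10 MiB threshold are hypothetical, since the description does not specify a threshold value.

```python
# Hypothetical threshold; the description leaves the actual value open.
ELEPHANT_FILE_BYTES = 10 * 1024 * 1024


def is_elephant_by_file_size(file_size_bytes, threshold=ELEPHANT_FILE_BYTES):
    """Return True when the file to be transferred is over the threshold size."""
    return file_size_bytes > threshold
```

Because the decision depends only on the file size, a detector of this kind could classify the flow before any packet of the transfer is sent.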

Instead of identifying a file size, the elephant detector 125 of some embodiments tracks the amount of data that has been transferred (e.g., sent or received). In some embodiments, the elephant detector 125 tracks the data size associated with every packet transferred in a data flow (e.g., through a network connection). For instance, the elephant detector 125 may calculate the amount of data transferred by accumulating or adding the number of bytes transferred with each packet. If the number of bytes is over a threshold value, the elephant detector 125 of some embodiments declares the data flow to be an elephant flow.
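The per-packet byte accumulation can be sketched as follows. This is an illustrative sketch under stated assumptions: the class name, the method name, and the use of an opaque flow identifier are hypothetical and do not come from the description.

```python
from collections import defaultdict


class ByteCountDetector:
    """Accumulates bytes per flow and flags flows that exceed a threshold."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes
        self.bytes_seen = defaultdict(int)  # flow identifier -> running byte total
        self.elephants = set()              # flows already declared elephants

    def observe(self, flow_id, packet_len):
        """Add one packet's size to the flow total; return True if the flow is an elephant."""
        self.bytes_seen[flow_id] += packet_len
        if self.bytes_seen[flow_id] > self.threshold_bytes:
            self.elephants.add(flow_id)
        return flow_id in self.elephants
```

Each flow keeps its own counter, so a mouse flow sharing the machine with an elephant flow is not affected by the other flow's total.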

In conjunction with byte count or instead of it, the elephant detector 125 of some embodiments factors in time. As an example, the elephant detector 125 might detect an elephant flow solely based on how long the data flow has been associated with the machine. That is, if the duration of the data flow is over a set period of time, the elephant detector 125 might determine that the data flow is an elephant flow. The duration can be calculated based on how long a network connection has been opened to handle the data flow. Also, instead of byte count, the process might calculate bit rate or bytes per second. The bit rate can be used to allow elephant flows with slow data transfer rate to progress as normal. This is because such elephant flows with slow transfer rate may not be contributing, or at least not contributing significantly, to the latency of other data flows, such as mice flows and non-detected elephant flows.
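One way to combine duration and bit rate is sketched below. This is only one possible reading of the passage: the function names, parameters, and the choice to require both a minimum duration and a minimum rate are assumptions made for illustration.

```python
def bit_rate(bytes_transferred, duration_s):
    """Average bit rate of a flow in bits per second (0.0 for a zero-length interval)."""
    if duration_s <= 0:
        return 0.0
    return bytes_transferred * 8 / duration_s


def is_elephant_by_time(bytes_transferred, duration_s, min_duration_s, min_rate_bps=0.0):
    """Flag a flow that has lasted longer than min_duration_s.

    A long-lived flow whose average rate is below min_rate_bps is left alone,
    so that slow transfers progress as normal.
    """
    if duration_s <= min_duration_s:
        return False
    return bit_rate(bytes_transferred, duration_s) >= min_rate_bps
```

With `min_rate_bps` left at its default of 0.0, the check reduces to detection based solely on duration.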

In the example of FIG. 1, the elephant detector 125 is shown as a component that runs on the OS 140 of the machine 100. However, the elephant detector 125 may be a part of a hypervisor that executes on a host machine, a component or module that executes on a separate machine (e.g., virtual or physical machine), or a dedicated appliance.

The network stack 130 conceptually represents the OS's implementation of a number of different protocols to send and receive data through a network. The network stack 130 may be a part of the OS's kernel. The network stack, such as the TCP/IP network stack, is used to process data through several different layers. For instance, when outputting data from the machine, the data may be sent to a socket buffer and processed at the TCP layer to create TCP segments or packets. Each segment is then processed by a lower layer, such as the IP layer to add an IP header. The output of the network stack is a set of packets associated with the outbound data flow. Each packet may be processed (e.g., segmented) by the machine's network interface card, and sent to the forwarding element 135. On the other hand, when receiving data at the machine, each packet may be processed by one or more of the layers in reverse order to strip one or more headers, and place the user data or payload in an input socket buffer.

In some embodiments, the forwarding element 135 is a hardware-forwarding element. The hardware-forwarding element can have application-specific integrated circuits (ASICs) that are specifically designed to support in-hardware forwarding. Alternatively, the physical forwarding element 135 can be a software-forwarding element, such as Open vSwitch (OVS). In some embodiments, the forwarding element (e.g., software or hardware forwarding element) is a physical forwarding element that operates in conjunction with one or more other physical forwarding elements to collectively implement different logical forwarding elements (e.g., logical switches, logical routers, etc.) for different logical networks of different tenants, users, departments, etc. that use the same shared computing and networking resources. Accordingly, the term “physical forwarding element” is used herein to differentiate it from a logical forwarding element.

In some embodiments, the forwarding element 135 is an edge forwarding element (EFE). In some embodiments, the edge forwarding element represents a last forwarding element before one or more end machines (e.g., the machine 100). Alternatively, the forwarding element 135 can be a non-edge forwarding element (NEFE). Irrespective of whether the forwarding element 135 is positioned at the edge of the network or not, the forwarding element is configured to treat packets associated with a detected elephant flow differently than other packets associated with other data flows (e.g., mouse flows, non-detected elephant flows).

Having described several components of FIG. 1, example operations of these components will now be described by reference to the four stages 105-120 that are illustrated in the figure. The first stage 105 shows that the machine 100 is communicatively coupled to the forwarding element 135. The data flow interception agent 145 operates on the OS 140 of the machine 100 to detect an initiation of a new data flow. In the first stage 105, the data flow interception agent 145 has detected a new data flow being initiated. The data flow interception agent 145 might have made the detection by intercepting a call to open or use a particular network connection to handle the new data flow. In some embodiments, the data flow interception agent 145 detects a new data flow by intercepting an application request to send or receive a file.

The second stage 110 shows the data flow interception agent 145 sending context data associated with the data flow to the elephant detector 125. As mentioned above, the context information can include user data, application data, and/or machine data. For instance, the system may identify the name of the source machine, the address (e.g., MAC address, IP address) associated with the source machine, the address associated with a destination machine, a port number associated with the source and/or destination, the application that initiated the call, user data (e.g., username of the person that is logged onto the machine), etc. In some embodiments, the source is identified as a combination of source IP address and port number; and the destination is identified by a combination of destination IP address and port number.
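The context data passed from the interception agent to the elephant detector can be pictured as a simple record. The sketch below is illustrative only; the class name, field names, and the `flow_key` helper are hypothetical and merely mirror the items listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowContext:
    """Context for one data flow (hypothetical field names)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    machine_name: Optional[str] = None  # name of the source machine
    application: Optional[str] = None   # application that initiated the call
    username: Optional[str] = None      # user logged onto the machine

    def flow_key(self):
        """Identify the flow by source IP/port and destination IP/port."""
        return (self.src_ip, self.src_port, self.dst_ip, self.dst_port)
```

Keeping the identifying fields separate from the descriptive ones lets a receiver match packets on the key while still reporting the user and application with the flow.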

The second stage 110 also shows that the new data flow has been started. This is conceptually shown with several packets being forwarded by the forwarding element 135.

The third stage 115 shows the elephant detector 125 determining that the data flow is an elephant flow. As mentioned above, the elephant detector 125 can make this determination based on the size of the file that is being transferred in the data flow. For instance, the elephant detector can identify the size of the file; and if the file is over the threshold value, the elephant detector can specify that the data flow is an elephant flow. Instead of identifying a file size, the elephant detector may track the amount of data that has been transferred. For instance, the elephant detector may calculate the amount of data transferred by accumulating or adding the number of bytes transferred with each packet.

The fourth stage 120 shows an example operation of the elephant detector 125 upon detecting the elephant flow. Here, the elephant detector 125 reports the elephant flow to the forwarding element 135. The forwarding element 135 may then use Quality of Service (QOS) configuration to place packets belonging to the elephant flow in a particular queue that is separate from one or more other queues with other packets. In this manner, one set of packets belonging to a mouse is not stuck in the same queue behind another set of packets belonging to an elephant.
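The queue separation can be sketched with two software queues. This is a simplified stand-in for a forwarding element's QoS configuration, not a model of any particular switch; the class and method names are hypothetical.

```python
from collections import deque


class TwoQueueForwarder:
    """Keeps packets of reported elephant flows in their own queue."""

    def __init__(self):
        self.elephant_flows = set()
        self.default_queue = deque()
        self.elephant_queue = deque()

    def report_elephant(self, flow_key):
        """Record a flow that a detector has reported as an elephant."""
        self.elephant_flows.add(flow_key)

    def enqueue(self, flow_key, packet):
        """Place a packet in the queue that matches its flow's classification."""
        if flow_key in self.elephant_flows:
            self.elephant_queue.append(packet)
        else:
            self.default_queue.append(packet)
```

Because mouse packets never enter the elephant queue, they are not held behind a backlog of elephant packets in this sketch.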

Alternatively, the forwarding element 135 may send packets associated with an elephant flow along different paths (e.g., equal-cost multipath routing (ECMP) legs) to break the elephant flow into mice flows. As another example, the forwarding element 135 may send elephant flow traffic along a separate physical network, such as an optical network that is more suitable for slow changing, bandwidth-intensive traffic. In some embodiments, the elephant detector reports the elephant flow to a network controller (e.g., a software-defined networking controller) that can configure one or more forwarding elements (e.g., the forwarding element 135) to handle the elephant flow.
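One possible way to spread an elephant flow over several ECMP legs is sketched below. The description does not say how the flow is divided, so the chunk-based rotation, the function name, and the use of a CRC-32 hash are all assumptions made for illustration.

```python
import zlib


def ecmp_leg(flow_key, num_legs, chunk_index=0):
    """Pick an ECMP leg for a chunk of a flow.

    With chunk_index left at 0, every packet of the flow hashes to the same
    leg. For a detected elephant flow, the caller advances chunk_index per
    chunk so that successive chunks rotate across the available legs.
    """
    base = zlib.crc32(repr(flow_key).encode()) % num_legs
    return (base + chunk_index) % num_legs
```

Sending parts of one flow over different paths can reorder packets at the receiver, so a real deployment would need to choose chunk boundaries with that in mind.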

Having described an example of detecting an elephant flow, a process will now be described by reference to FIG. 2. FIG. 2 conceptually illustrates a process 200 that some embodiments implement in order to detect elephant flows. In some embodiments, the process 200 is performed by one or more components shown in FIG. 1, such as the data flow interception agent 145 and the elephant detector 125.

As shown in FIG. 2, the process 200 begins when it identifies (at 205) a new flow of data that is associated with a machine. The data flow can be an inbound or outbound data flow. The process 200 then determines (at 210) whether the amount of data being sent or received is greater than a threshold value. If the amount is not over, the process 200 ends. However, if the amount is over, the process 200 specifies (at 215) that the data flow is an elephant flow. The process 200 then reports (at 220) the elephant flow to an agent that is interested in the report. The agent that receives a message regarding the elephant flow may be operating on the same operating system (OS) as the elephant detector, the same machine (e.g., as a part of a virtual switch, as a part of a hypervisor, as part of a service virtual machine), or another machine or device (e.g., as part of a network controller which controls one or more software or hardware forwarding elements, as a part of hardware switch, as part of a dedicated appliance, etc.). In some embodiments, the agent on the same machine facilitates in marking packets associated with the elephant flow with a particular mark. As an example, the agent may mark each packet associated with an elephant flow using a Differentiated Services Code Point (DSCP) field that provides different levels of service to be assigned to network traffic, such as IP packets. The process 200 then ends.
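The operations of the process 200, together with a DSCP-based mark, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the callback-style report, and the example DSCP value are hypothetical. The only protocol fact relied on is that the DSCP occupies the upper six bits of the IP header's DS (formerly TOS) byte.

```python
def detect_and_report(flow_key, bytes_transferred, threshold_bytes, report):
    """Sketch of operations 210-220: compare, specify, and report.

    `report` is any callable standing in for the interested agent.
    Returns True when the flow was specified as an elephant flow.
    """
    if bytes_transferred <= threshold_bytes:  # 210: amount is not over
        return False
    report(flow_key)                          # 215/220: specify and report
    return True


def dscp_to_ds_byte(dscp):
    """Convert a 6-bit DSCP value into the 8-bit DS byte carried in an IP header."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2
```

On platforms where Python exposes `socket.IP_TOS`, an agent could apply the resulting byte to a socket with `setsockopt(socket.IPPROTO_IP, socket.IP_TOS, value)`; which DSCP value denotes an elephant flow is a deployment choice that the description leaves open.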

In some embodiments, the process 200 performs other calculations to determine whether a data flow is an elephant flow. In conjunction with byte count or instead of it, the process 200 of some embodiments factors in time. As an example, the process 200 might detect an elephant flow solely based on how long the data flow has been associated with the machine. That is, if the duration of the data flow is over a set period of time, the process 200 might determine that the data flow is an elephant flow. The duration can be calculated based on how long a network connection has been opened to handle the data flow. Also, instead of byte count, the process 200 might calculate bit rate or bytes per second. The bit rate can be used to allow elephant flows with slow data transfer rate to progress as normal. This is because such elephant flows with slow transfer rate may not be contributing, or at least not contributing significantly, to the latency of other data flows.

Some embodiments perform variations on the process 200. The specific operations of the process 200 may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments.

Several more examples of detecting and handling elephant flows will be described in detail below. Section I describes several additional examples of detecting elephant flows based on machine introspection. In particular, Section I.A describes an example of detecting a new data flow by intercepting a network connection that is being opened on the machine. Section I.B then describes an example of detecting a new data flow by capturing a request to transfer a file. This is followed by Section I.C, which describes an example of detecting an elephant flow using a combination of machine introspection and packet inspection. Section II then describes an example electronic system with which some embodiments of the invention are implemented.

I. Examples of Detecting Elephant Flows Based on Machine Introspection

In some embodiments, the system detects an elephant flow by monitoring, among other things, the operations of a machine. In detecting, the system uses machine introspection to intercept a new network connection that is being initiated on the machine. The system then identifies context information associated with the connection. The context information can include one or more of the following: the name of the source machine, the address (e.g., MAC address, IP address) associated with the source machine, the address associated with a destination machine, a port number for the source and/or destination, the application that initiated the call, user data (e.g., username of the person that is logged onto the machine), etc. The system then uses machine introspection to track how much data the machine is sending or receiving through the connection. The system then determines, based on the amount of data being sent or received, whether the data flow associated with the connection is an elephant flow. If the system detects an elephant flow, it reports the elephant flow and the associated context information to an agent (e.g., a forwarding element, a network controller) that is interested in the elephant flow.

The system of some embodiments can also, or alternatively, detect an elephant flow based on high-level application calls, rather than low-level system calls. As an example, the system might detect an OS/library or application programming interface (API) call to send or receive a file. The system then determines whether an elephant flow is associated with the call by identifying the size of the file that is being transferred. If the size is greater than a threshold value, the system then reports the elephant flow to the agent. In some embodiments, the system detects an elephant flow using a combination of machine introspection and packet inspection. Several such examples will now be described below by reference to FIGS. 3-10.

A. Introspecting Control Path and Data Path

FIG. 3 illustrates an example of inspecting operations of a machine to detect an elephant flow. Specifically, this figure shows in four stages 301-304 how an introspection agent 345 on a machine 305 detects a new network connection and reports the new connection to an elephant detector 340. The elephant detector 340 then determines the amount of data transferred with the network connection. If the amount is over a threshold value, the elephant detector 340 sends a message regarding the elephant flow to any party or agent that is interested in such a report.

In the example of FIG. 3, the machine 305 is a guest virtual machine (VM) that executes on a host machine 300. However, it should be understood that the example shown in the figure is equally applicable to a physical machine. As shown, the introspection agent 345 is a thin agent that is installed on the operating system (OS) of the guest VM 305. The elephant detector 340 is a component that operates on the host machine 300. For instance, the elephant detector 340 may be a component that executes on a VM separate from the guest VM 305. The figure also shows a number of applications 310-320. These applications could be any type of application that sends data over a network.

In some embodiments, the thin agent 345 operates on the VM 305 and intercepts different types of events that are occurring on the VM. From a control path point of view, the thin agent 345 may intercept a new network connection being made and/or a file being opened. For instance, when a user sends a file through an application (310, 315, or 320), the thin agent 345 may detect that the application has made a socket call to open a new connection.

To trap control data events, the thin agent 345 of some embodiments includes a set of one or more control path interceptors 325. One example of a control path interceptor 325 is a network introspection module that intercepts socket calls being made on the VM 305 to open a new network connection. For each intercepted call, the network introspection module may identify various pieces of information (e.g., contextual information) associated with the call. The network introspection module may identify the name of the VM, the address (e.g., MAC address, IP address) associated with the VM, the address associated with the destination machine, a port number for the source and/or destination, the application that initiated the call, user data (e.g., username of the person that is logged onto the VM), etc. The thin agent 345 can include different types of control path interceptors to perform machine introspection, such as a file introspection module that detects calls to open different files.

To trap information on the data path, the thin agent 345 of some embodiments includes a set of one or more data path interceptors 330. In some embodiments, a data path interceptor 330 identifies the size of data associated with each packet transferred over the network connection. In some embodiments, the data size is the size of the payload or user data of the packet. Alternatively, the data size may be the actual size of the packet including one or more protocol headers and trailers. In some embodiments, the data path interceptor 330 reports the data size associated with each packet to the elephant detector 340. To optimize processing, the data path interceptor may report the data size once it reaches a certain limit, in some embodiments.

The elephant detector 340 of some embodiments tracks the amount of data that has been transferred (e.g., sent or received) with the network connection. In determining the amount of data being sent or received, the elephant detector may receive the data size of each packet with a unique identifier for the data flow or the network connection. If the data amount reaches a threshold value, the elephant detector then correlates the unique identifier with the context information. Thereafter, the elephant detector may report the elephant flow and the associated context information.
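
The tracking and correlation described above can be sketched in Python as follows. This is a minimal illustration only, not the implementation of any embodiment: the class name, the method names, the report layout, and the 10 MB default threshold are all assumptions introduced here, since the description does not specify them.

```python
from dataclasses import dataclass, field

# Hypothetical default; the description does not give a threshold value.
DEFAULT_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class ElephantDetector:
    """Illustrative per-connection byte counter with a single threshold."""
    threshold: int = DEFAULT_THRESHOLD_BYTES
    contexts: dict = field(default_factory=dict)     # flow_id -> context info
    byte_counts: dict = field(default_factory=dict)  # flow_id -> bytes so far
    reported: set = field(default_factory=set)       # flow_ids already reported

    def on_new_connection(self, flow_id, context):
        # Context captured on the control path (application, user, addresses).
        self.contexts[flow_id] = context
        self.byte_counts[flow_id] = 0

    def on_data_size(self, flow_id, size):
        # Data size reported on the data path, tagged with the unique identifier.
        total = self.byte_counts.get(flow_id, 0) + size
        self.byte_counts[flow_id] = total
        if total >= self.threshold and flow_id not in self.reported:
            self.reported.add(flow_id)
            # Correlate the identifier with its context only once the
            # threshold is reached, then hand back a report.
            return {
                "flow_id": flow_id,
                "bytes": total,
                "context": self.contexts.get(flow_id, {}),
            }
        return None
```

In this sketch a flow is reported at most once, and the correlation with context information is deferred until the threshold is reached, which mirrors the order of operations described above.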

In some embodiments, the thin agent 345 or another component on the guest VM 305 performs the data size aggregation and elephant flow detection. For instance, if the thin agent is aware of the threshold value, the thin agent can add the data size associated with each packet to determine whether the amount of data transferred is greater than the threshold value.

As shown in FIG. 3, a guest introspection host module 335 (hereinafter referred to as a multiplexer (MUX)) operates on the host 300. The MUX 335 receives intercepted data from the thin agent 345. The MUX 335 then forwards the intercepted data to one or more components, appliances, and/or devices that are interested in that data. For example, a component operating on the host 300 may be registered with the MUX 335 to receive an asynchronous notification each time a certain type of control path event is intercepted by the thin agent 345 on the VM 305.
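
The registration and forwarding role of the MUX can be illustrated with a small publish/subscribe sketch. The class name, method names, and event type strings below are hypothetical, and delivery is shown as a direct (synchronous) call for brevity; an asynchronous variant would queue the notification instead.

```python
class Mux:
    """Illustrative multiplexer: forwards intercepted data to registered parties."""

    def __init__(self):
        self._subscribers = {}  # event type -> list of callbacks

    def register(self, event_type, callback):
        # A component registers interest in one type of intercepted event.
        self._subscribers.setdefault(event_type, []).append(callback)

    def forward(self, event_type, data):
        # Deliver the intercepted data to every registered component and
        # return how many components were notified.
        callbacks = self._subscribers.get(event_type, [])
        for callback in callbacks:
            callback(data)
        return len(callbacks)
```

A detector would call `register("new_connection", handler)` once, after which each matching event forwarded by the thin agent reaches `handler`.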

In some embodiments, the elephant detector 340 detects elephant flows using information provided by the thin agent 345 via the MUX 335. In the example of FIG. 3, the elephant detector is shown as a component that executes on the host 300. For instance, the elephant detector 340 may be a component that executes on a VM separate from the guest VM 305. Alternatively, the elephant detector 340 may be a part of a hypervisor that executes on the host, a component that executes on a separate physical machine, or a dedicated appliance.

Having described example components of FIG. 3, example operations of these components will now be described by reference to the four stages 301-304 that are illustrated in the figure. The first stage 301 shows the VM 305 executing on the host 300. The thin agent 345 has been installed on the OS of the VM 305. The control path interceptor 325 is registered to intercept calls to create new network connections. The thin agent 345 may have previously updated information regarding the user that is logged onto the OS. The user information is updated because the thin agent 345 of some embodiments reports the information with each intercepted event.

The second stage 302 shows the thin agent 345 intercepting a request made by the application 310 to open a new connection. In particular, the control path interceptor 325 intercepts a socket call and identifies information regarding the socket call. The control path interceptor 325 may identify the application 310 that initiated the call, user data, the type of connection (e.g., inbound or outbound), the address (e.g., MAC address, IP address) associated with the VM, the address associated with the destination machine, a port number, etc.

The second stage 302 also shows the thin agent 345 facilitating detection of the amount of data being sent or received through the network connection. As mentioned above, the data path interceptor 330 reports the file or data size associated with each packet to the elephant detector 340. To optimize processing, the data path interceptor may report the data size once it reaches a certain limit, in some embodiments.

The third stage 303 illustrates the thin agent 345 sending data to the MUX 335 regarding the new connection. The thin agent 345 of some embodiments sends the data upon trapping the data. That is, the thin agent 345 does not wait for a specified period of time but sends the data immediately to the MUX 335. The thin agent 345 might format the data in a particular format prior to sending it to the MUX 335. For instance, the thin agent might encode the data in JavaScript Object Notation (JSON) or Extensible Markup Language (XML). In some embodiments, the thin agent 345 maintains a local copy of each event. For a given session, the thin agent 345 may maintain a log of all intercepted events. The local copy provides a backup in case there is a communication error between the thin agent 345 and the MUX 335.
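
As a minimal sketch of the behavior described in this stage, the following Python code encodes each intercepted event as JSON, records it in a local log, and sends it immediately. The event field names, the `send` callable, and the use of `ConnectionError` to model a communication error are assumptions made for illustration; the description does not define a message schema or transport.

```python
import json


class ThinAgentSender:
    """Illustrative sender: log locally, then forward each event at once."""

    def __init__(self, send):
        self.send = send  # hypothetical callable that delivers a string to the MUX
        self.log = []     # local copy of every intercepted event for the session

    def on_event(self, event):
        payload = json.dumps(event, sort_keys=True)
        # Keep the local copy first so it survives a failed send.
        self.log.append(payload)
        try:
            self.send(payload)  # sent immediately, with no batching delay
            return True
        except ConnectionError:
            # The event remains in the local log as a backup.
            return False
```

Because the local copy is appended before the send is attempted, an event is retained even when delivery to the MUX fails.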

The fourth stage 304 illustrates the MUX 335 sending the data to the elephant detector 340. Here, the MUX 335 provides the data asynchronously or synchronously to the elephant detector 340. The MUX 335 may also store the data in storage (not shown) prior to sending the data to the elephant detector 340. The MUX 335 may receive the data in one format and send the data in the same format or a different format to the elephant detector 340, in some embodiments.

In the fourth stage 304, the elephant detector 340 receives the data from the MUX 335. The elephant detector 340 then uses the data to track the amount of data flowing through the network connection. If the amount exceeds a threshold value, the elephant detector 340 may specify that the data flow associated with the connection is an elephant flow. The elephant detector 340 may then report the elephant flow to an agent or party that is interested in the report. For instance, in some embodiments, the elephant detector reports the elephant flow to a network controller, which in turn configures one or more forwarding elements to handle the elephant flow. Alternatively, the report may be sent directly to a forwarding element.

In the example described above, the thin agent 345 intercepts the initiation of a new connection, and gathers data relating to the connection (e.g., data size of each packet, context data). The thin agent 345 then provides the data to the MUX 335, which in turn provides the data to the elephant detector 340.

FIG. 4 illustrates a process 400 that some embodiments implement to intercept a new connection for a data flow and determine whether the data flow is an elephant flow. In some embodiments, the process 400 is performed by one or more components shown in FIG. 3, such as the thin agent 345 and the elephant detector 340.

As shown in FIG. 4, the process 400 begins when it registers (at 405) to intercept the initiation of any new network connections (e.g., outbound network connections, inbound network connections). The process 400 then begins detecting whether a new connection has been initiated. For instance, the process 400 of some embodiments detects (at 410) whether a socket call has been made to open a new connection. If such a call has been made, the process 400 identifies (at 415) the application, the user, and/or VM context information associated with the connection. The process 400 then tracks (at 420) the amount of data being transferred (e.g., sent or received) over the network connection.

At 425, the process 400 determines whether the amount of data being sent or received is greater than a threshold value. If the amount is over the threshold value, the process 400 specifies (at 430) that the data flow is an elephant flow. The process 400 then reports (at 435) the elephant flow (e.g., to a forwarding element and/or a network controller). The agent that receives a message regarding the elephant flow may be operating on the same operating system (OS) as the elephant detector 340, the same machine (e.g., as a part of a virtual switch, as a part of a hypervisor, as part of a service virtual machine, etc.) or another machine or device (e.g., as part of a network controller which controls one or more software or hardware forwarding elements, as a part of a hardware switch, as part of a dedicated appliance, etc.). In some embodiments, the agent on the same machine facilitates marking packets associated with the elephant flow with a particular mark. As an example, the agent may mark each packet associated with an elephant flow using a Differentiated Services Code Point (DSCP) field that allows different levels of service to be assigned to network traffic, such as IP packets. If the amount of data transferred is not greater than the threshold value, the process 400 assumes that the data flow associated with the connection is a mouse flow and does not report it. The process 400 then ends.
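
For the DSCP marking mentioned above, the following Python sketch shows one way a code point could be placed in the IP header of traffic sent on a socket. It is offered only as an illustration under stated assumptions: the helper names are hypothetical, the description does not say which DSCP value an embodiment would use, and whether the `IP_TOS` socket option is honored depends on the platform. The DSCP occupies the upper six bits of the former Type of Service byte, with the lower two bits used for ECN.

```python
import socket


def dscp_to_tos(dscp, ecn=0):
    """Build the 8-bit TOS/traffic-class byte from a 6-bit DSCP and 2-bit ECN."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits (0-63)")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must fit in 2 bits (0-3)")
    return (dscp << 2) | ecn


def mark_socket(sock, dscp):
    # Hypothetical helper: ask the OS to stamp outgoing IPv4 packets on this
    # socket with the given DSCP. Support for IP_TOS varies by platform.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
```

For example, a DSCP of 10 (AF11) corresponds to a TOS byte of 40. A forwarding element could then match on that field to treat the marked packets differently from mouse flow traffic.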

Some embodiments perform variations on the process 400. The specific operations of the process 400 may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. For instance, the tracking operation 420 may be performed by different components, such as the thin agent or the elephant detector. As mentioned above, if the thin agent is aware of the threshold value, the thin agent may aggregate the data size and perform the elephant flow detection. Also, the thin agent might send data to the elephant detector through one or more intermediary agents (e.g., the MUX).

B. High-Level Application Calls

In the example described above, the thin agent intercepts a socket call to open a new network connection. The system of some embodiments can detect an elephant flow based on an application call for a new file transfer. As an example, the system might detect an OS/library or application programming interface (API) call to send or receive a file. The call may be associated with a particular network protocol for transferring files from one network host to another network host. Examples of different widely used data transfer protocols include file transfer protocol (FTP), Secure Shell (SSH) file transfer protocol, BitTorrent, etc. The system then determines whether an elephant flow is associated with the call by identifying the size of the file that is being transferred with the call. If the size is greater than a threshold value, the system then reports the elephant flow to an agent that is interested in the report.

FIG. 5 illustrates an example of intercepting a high-level application call to detect an elephant flow. Four operational stages 505-520 of the host 500 are shown in this figure. This figure is similar to FIG. 3, except that the control and data path interceptors 325 and 330 have been replaced by a set of API call interceptors 525.

The first stage 505 shows the VM 530 executing on the host 500. The thin agent 345 has been installed on the OS of the VM 530. The API call interceptor 525 is registered to intercept high-level application calls to send or receive data over a network. The thin agent 345 may have previously updated information regarding the user that is logged onto the OS. The user information is updated because the thin agent 345 of some embodiments reports the information with each intercepted event.

The second stage 510 shows the thin agent 345 intercepting an API call made by the application 310. In particular, the API call interceptor 525 intercepts an API call to transmit a file on a socket. The API call interceptor 525 may identify the application 310 that initiated the call, user data, the type of connection (e.g., inbound or outbound), the address (e.g., MAC address, IP address) associated with the VM, the address associated with the destination machine, a port number, etc.

The second stage 510 also shows the thin agent 345 identifying the size of the data that is to be sent or received over the new connection. The third stage 515 illustrates the thin agent 345 sending data to the MUX 335 regarding the API call. The fourth stage 520 illustrates the MUX 335 sending the data to the elephant detector 340.

In the fourth stage 520, the elephant detector 340 receives the data from the MUX 335. Having received the data, the elephant detector 340 then analyzes the data to detect an elephant flow. In some embodiments, the elephant detector 340 receives the size of the data that is being transferred over the network connection with the API call. The elephant detector 340 may compare the size of the data to a threshold value. If the size exceeds the threshold value, the elephant detector 340 may specify that the data flow associated with the API call is an elephant flow. The elephant detector 340 may then report the elephant flow to an agent or party that is interested in the report. For instance, in some embodiments, the elephant detector reports the elephant flow to a network controller, which in turn configures one or more forwarding elements to handle the elephant flow. Alternatively, the report may be sent directly to a forwarding element.

In the example described above, the thin agent 345 intercepts an API call and identifies the size of the file that is to be downloaded to or uploaded from the machine. The thin agent 345 then provides the data to the MUX 335, which in turn provides the data to the elephant detector 340. FIG. 6 illustrates a process 600 that some embodiments implement to intercept a high-level application call to transmit a file on a socket. This is followed by FIG. 7, which illustrates a process 700 that some embodiments implement to analyze the data to detect an elephant flow. In some embodiments, the process 600 of FIG. 6 is performed by the thin agent 345, and the process 700 of FIG. 7 is performed by the elephant detector 340.

As shown in FIG. 6, the process 600 begins when it registers (at 605) to intercept API calls to send or receive a file on a socket. The process 600 then begins detecting (at 610) whether a new file transfer has been initiated (e.g., a new file is being sent or received). If a detection has been made, the process 600 identifies (at 615) the application, the user, and/or VM context information associated with the connection. The process 600 then identifies (at 620) the size of the file that is being transferred with the call. The process 600 then reports (at 625) the context information and the size of the data (e.g., to the MUX). The process 600 then ends.

As shown in FIG. 7, the process 700 begins when it receives (at 705) the context information and the size of the file. Here, the dashed arrow between the two processes 600 and 700 conceptually illustrates that the elephant detector might have received the data through an intermediary agent (e.g., the MUX). At 710, the process 700 determines whether the size of the file is greater than a threshold value. If so, the process 700 specifies (at 715) that the data flow associated with the API call is an elephant flow.

At 720, the process 700 reports the elephant flow (e.g., to a forwarding element and/or a network controller). If the size of the file is not greater than a threshold value, the process 700 assumes that the data flow associated with the connection is a mouse flow and does not report it. The process 700 then ends. Some embodiments perform variations on the processes 600 and 700. The specific operations of the processes 600 and 700 may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments.
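
The hand-off between the processes 600 and 700 can be sketched as two small Python functions. The function names, the dictionary layout, and the threshold passed in by the caller are assumptions for illustration; in particular, the file size is taken here as an argument, whereas an actual agent would obtain it from the intercepted call.

```python
def report_file_transfer(context, file_size):
    """Process 600 (sketch): package the context and the file size."""
    # file_size stands in for the size identified from the intercepted API call.
    return {"context": context, "file_size": file_size}


def detect_elephant(report, threshold):
    """Process 700 (sketch): return an elephant flow report, or None for a mouse."""
    if report["file_size"] > threshold:
        return {"elephant": True, "context": report["context"]}
    # Not greater than the threshold: assumed to be a mouse flow, not reported.
    return None
```

Unlike the per-packet tracking of the process 400, this variant needs only one comparison per transfer because the total size is known when the call is intercepted.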

C. Combination of Machine Introspection and Packet Inspection

In some embodiments, the system detects an elephant flow using a combination of machine introspection and packet inspection. In detecting, the system may intercept a socket call that is being initiated by an application, or may intercept a high-level API call. In determining the amount of data being sent, the system of some embodiments specifies a unique identifier for the network connection. The system then marks outbound data (e.g., packets) with a marking (e.g., a unique marking) and keeps track of the amount of data using the marking. If the amount of data sent reaches a threshold value, the system then correlates the unique identifier of the outbound traffic with the context information. Thereafter, the system uses the context information to relay a message regarding the detected elephant flow (e.g., to an agent that is interested in the report).

FIG. 8 illustrates an example of detecting an elephant flow using a combination of machine introspection and packet inspection. In this example, the machine introspection is performed by the thin agent 345 executing on the VM 835. The packet inspection is performed by a hypervisor 825, namely the hypervisor's packet inspection module 830.

In this example, the hypervisor 825 is a bare metal hypervisor that runs on top of the hardware of the host machine 800 and runs below any operating system (OS). The hypervisor 825 handles various management tasks, such as memory management, processor scheduling, or any other operations for controlling the execution of the VM 835. Moreover, the hypervisor 825 can communicate with the VM 835 to achieve various operations (e.g., setting priorities). In some embodiments, the hypervisor 825 is one type of hypervisor (Xen, ESX, or KVM hypervisor) while, in other embodiments, the hypervisor 825 may be any other type of hypervisor for providing hardware virtualization of the hardware on the host 800.

To perform packet inspection, the hypervisor 825 includes the packet inspection module 830. For the purposes of elephant flow detection, the packet inspection module 830 is used to determine the amount of data sent over a particular connection. For instance, the packet inspection module of some embodiments keeps track of the amount of data sent (e.g., byte count and/or packet count) with a particular identifier (e.g., unique connection ID). In some embodiments, the packet inspection module uses packet header data to track the amount of data sent or received. For instance, the packet inspection module might identify the size of the payload of a packet from a header field. If the amount of data sent reaches a threshold value, the system then correlates the particular identifier with the context information associated with the connection. Thereafter, the hypervisor 825 uses the context information to report the elephant flow.

Having described example components of FIG. 8, example operations of these components will now be described by reference to the four stages 805-820 that are illustrated in the figure. The first stage 805 shows the VM 835 executing on the host 800. The thin agent 345 has been installed on the OS of the VM 835. The control path interceptor 325 is registered to intercept calls to create new network connections. The thin agent 345 may have previously updated information regarding the user that is logged onto the OS. The user information is updated because the thin agent 345 of some embodiments reports the information with each intercepted event.

The second stage 810 shows the thin agent 345 intercepting a request made by the application 310 to open a new connection. In particular, the control path interceptor 325 intercepts a socket call and identifies information regarding the socket call. The control path interceptor 325 may identify the application 310 that initiated the call, user data, the type of connection (e.g., inbound or outbound), the address (e.g., MAC address, IP address) associated with the VM, the address associated with the destination machine, a port number, etc.

The third stage 815 illustrates the thin agent 345 sending data to the MUX 335 regarding the new connection. Prior to sending the data, the thin agent of some embodiments specifies an identifier (e.g., a unique identifier) for the connection. This unique identifier allows the packet inspection module 830 to keep track of the number of bytes sent and/or the number of packets sent through the connection.

The fourth stage 820 illustrates the MUX 335 sending the data to the hypervisor 825. Here, the MUX 335 provides the data asynchronously or synchronously to the hypervisor 825. In the fourth stage 820, the hypervisor 825 receives the data from the MUX 335. Having received the data, the packet inspection module then keeps track of the data sent through the connection. For example, the packet inspection module may store statistics relating to the packets with a particular marking. If the amount of data sent reaches a threshold value, the hypervisor 825 then correlates the particular marking with the context information associated with the connection. The hypervisor 825 then uses the context information to report the elephant flow. Here, the control path interceptor provides the context to the packet inspection module via the MUX. Also, each packet is stamped with an ID. This ID is then used to correlate the data being transferred with a connection (e.g., initiated by an application or user) and identify the elephant flow. In some embodiments, the hypervisor 825 reports the elephant flow to a network controller, which in turn configures one or more forwarding elements to handle the elephant flow. Alternatively, the report may be sent directly to a forwarding element.

In the example described above, the thin agent 345 intercepts the initiation of a new connection and the hypervisor identifies the amount of data that has been sent through the connection. FIG. 9 illustrates a process 900 that some embodiments implement to intercept a new connection and trap information regarding the connection. This is followed by FIG. 10, which illustrates a process 1000 that some embodiments implement to analyze the data to detect an elephant flow. In some embodiments, the process 900 of FIG. 9 is performed by the thin agent 345, and the process 1000 of FIG. 10 is performed by the hypervisor 825.

As shown in FIG. 9, the process 900 begins when it registers (at 905) to intercept the initiation of any new network connections (e.g., outbound network connections). The process 900 then begins detecting whether a new connection has been initiated. For instance, the process 900 of some embodiments detects (at 910) whether a socket call has been made to open a new connection. If a detection has been made, the process 900 identifies (at 915) the application, the user, and/or VM context information associated with the connection. The process 900 then defines or specifies (at 920) a unique marking to mark packets that are sent through the connection. The process 900 then reports (at 925) the context information and the marking. For instance, in the example described above, the control path interceptor provides the context to the packet inspection module via the MUX. Also, each packet is stamped with a marking. The process 900 then ends. In some embodiments, the process 900 returns to operation 920 when it detects another new connection.
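
Operations 915 through 925 can be illustrated with the following Python sketch, which assigns a distinct marking to each intercepted connection and pairs it with the context information. The class name, the method name, and the use of a simple incrementing integer as the marking are assumptions; the description does not specify how a marking is generated or encoded in a packet.

```python
import itertools


class ConnectionMarker:
    """Illustrative generator of per-connection markings (process 900 sketch)."""

    def __init__(self):
        self._next_marking = itertools.count(1)  # hypothetical marking scheme
        self.context_by_marking = {}

    def on_socket_call(self, context):
        # Operation 920: define a marking that is unique to this connection.
        marking = next(self._next_marking)
        self.context_by_marking[marking] = context
        # Operation 925: the marking and context are reported together so the
        # packet inspection side can later correlate packets with the context.
        return {"marking": marking, "context": context}
```

Keeping the marking-to-context mapping is what later allows a byte count accumulated per marking to be tied back to the application or user that opened the connection.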

As shown in FIG. 10, the process 1000 begins when it receives (at 1005) the context information and the marking. Here, the dashed arrow between the two processes 900 and 1000 conceptually illustrates that the hypervisor might have received the data through an intermediary agent (e.g., the MUX).

At 1010, the process 1000 starts inspecting packets sent through the connection using the marking. The marking relates the packets to the connection. In particular, the process 1000 identifies (at 1010) a packet with the marking. Based on the identified packet, the process 1000 determines (at 1015) whether the amount of data sent is greater than a threshold value. If so, the process 1000 specifies (at 1025) that the data flow associated with the connection is an elephant flow. That is, the marking is used to correlate the data being transferred with a connection (e.g., initiated by an application or user) and identify the elephant flow. The process 1000 then reports (at 1030) the elephant flow to an agent that is interested in the report. If the size of the data is not greater than a threshold value, the process 1000 proceeds to 1020, which is described below.

The process 1000 determines (at 1020) whether there is another packet being sent through the connection. If there is another packet, the process 1000 returns to 1010, which is described above. Otherwise, the process 1000 ends. Some embodiments perform variations on the processes 900 and 1000. The specific operations of the processes 900 and 1000 may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments.
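
The inspection loop of the process 1000 can be sketched in Python as follows. Packets are modeled here as `(marking, size)` tuples, which is an assumption made for illustration; the function name and return shape are likewise hypothetical.

```python
def inspect_marked_packets(packets, marking, threshold):
    """Return (is_elephant, bytes_counted) for the connection with this marking.

    packets: iterable of (marking, size_in_bytes) tuples (illustrative model).
    """
    total = 0
    for packet_marking, size in packets:        # 1020: is there another packet?
        if packet_marking != marking:
            continue                            # belongs to another connection
        total += size                           # 1010: packet with the marking
        if total > threshold:                   # 1015: over the threshold?
            return True, total                  # 1025/1030: specify and report
    return False, total                         # no more packets: process ends
```

The loop stops at the first packet that pushes the running total over the threshold, so the remaining packets of an already identified elephant flow are not counted in this sketch.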

For instance, in the example described above, the process 1000 inspects packets sent through a connection. In some embodiments, the process might perform similar operations for incoming packets. However, one or more of the detection rules are applied only based on the destination identity. This is because the process may only have the destination identity information and does not have the source identity information. As an example, the process might identify the destination IP address and destination port, and track the amount of data sent to the destination machine.

II. Electronic System

Many of the above-described features and applications are implemented as software processes that are specified as a set of instructions recorded on a computer readable storage medium (also referred to as computer readable medium). When these instructions are executed by one or more computational or processing unit(s) (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions. Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, random access memory (RAM) chips, hard drives, erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), etc. The computer readable media do not include carrier waves and electronic signals passing wirelessly or over wired connections.

In this specification, the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage, which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software invention described here is within the scope of the invention. In some embodiments, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.

FIG. 11 conceptually illustrates an electronic system 1100 with which some embodiments of the invention are implemented. The electronic system 1100 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc.), server, dedicated switch, phone, PDA, or any other sort of electronic or computing device. Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media. Electronic system 1100 includes a bus 1105, processing unit(s) 1110, a system memory 1125, a read-only memory 1130, a permanent storage device 1135, input devices 1140, and output devices 1145.

The bus 1105 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 1100. For instance, the bus 1105 communicatively connects the processing unit(s) 1110 with the read-only memory 1130, the system memory 1125, and the permanent storage device 1135.

From these various memory units, the processing unit(s) 1110 retrieves instructions to execute and data to process in order to execute the processes of the invention. The processing unit(s) may be a single processor or a multi-core processor in different embodiments.

The read-only-memory (ROM) 1130 stores static data and instructions that are needed by the processing unit(s) 1110 and other modules of the electronic system. The permanent storage device 1135, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 1100 is off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 1135.

Other embodiments use a removable storage device (such as a floppy disk, flash memory device, etc., and its corresponding drive) as the permanent storage device. Like the permanent storage device 1135, the system memory 1125 is a read-and-write memory device. However, unlike storage device 1135, the system memory 1125 is a volatile read-and-write memory, such as a random access memory. The system memory 1125 stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 1125, the permanent storage device 1135, and/or the read-only memory 1130. From these various memory units, the processing unit(s) 1110 retrieves instructions to execute and data to process in order to execute the processes of some embodiments.

The bus 1105 also connects to the input and output devices 1140 and 1145. The input devices 1140 enable the user to communicate information and select commands to the electronic system. The input devices 1140 include alphanumeric keyboards and pointing devices (also called “cursor control devices”), cameras (e.g., webcams), microphones or similar devices for receiving voice commands, etc. The output devices 1145 display images generated by the electronic system or otherwise output data. The output devices 1145 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD), as well as speakers or similar audio output devices. Some embodiments include devices such as a touchscreen that function as both input and output devices.

Finally, as shown in FIG. 11, bus 1105 also couples electronic system 1100 to a network 1165 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an Intranet), or a network of networks, such as the Internet. Any or all components of electronic system 1100 may be used in conjunction with the invention.

Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

While the above discussion primarily refers to microprocessor or multi-core processors that execute software, some embodiments are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In addition, some embodiments execute software stored in programmable logic devices (PLDs), ROM, or RAM devices.

As used in this specification and any claims of this application, the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms display or displaying mean displaying on an electronic device. As used in this specification and any claims of this application, the terms “computer readable medium,” “computer readable media,” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

While the invention has been described with reference to numerous specific details, it should be understood that the invention can be embodied in other specific forms without departing from the spirit of the invention. In addition, a number of the figures (including FIGS. 2, 4, 6, 7, 9, and 10) conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, the process could be implemented using several sub-processes, or as part of a larger macro process. Thus, it should be understood that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.
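For illustration only, the threshold-based detection described in this specification could be sketched as follows. The sketch covers two variants: deciding from a file size that an agent reports when a transfer starts, and deciding from a running count of bytes transferred on a flow. It is a minimal sketch under stated assumptions, not the implementation of any embodiment: the class name, method names, flow-key layout, threshold value, and reporting callback are all hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

# Hypothetical flow key: (source address, destination address, port).
FlowKey = Tuple[str, str, int]

# Assumed value for illustration; the specification does not fix a threshold.
ELEPHANT_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class ElephantDetector:
    """Toy detector that marks a flow as an elephant once a size threshold is exceeded."""

    # Hypothetical callback standing in for "reporting" (e.g., to a controller).
    report: Callable[[FlowKey], None]
    threshold: int = ELEPHANT_THRESHOLD_BYTES
    bytes_seen: Dict[FlowKey, int] = field(default_factory=dict)
    elephants: Set[FlowKey] = field(default_factory=set)

    def on_file_transfer(self, flow: FlowKey, file_size: int) -> bool:
        """Decide up front from the file size reported for a new transfer."""
        if file_size > self.threshold:
            self._mark(flow)
        return flow in self.elephants

    def on_data(self, flow: FlowKey, nbytes: int) -> bool:
        """Add nbytes to the flow's running total and compare it with the threshold."""
        total = self.bytes_seen.get(flow, 0) + nbytes
        self.bytes_seen[flow] = total
        if total > self.threshold:
            self._mark(flow)
        return flow in self.elephants

    def _mark(self, flow: FlowKey) -> None:
        """Record the flow as an elephant and report it exactly once."""
        if flow not in self.elephants:
            self.elephants.add(flow)
            self.report(flow)
```

In this sketch a flow is reported only the first time it crosses the threshold, so whatever consumes the reports (a list, a logger, a message to a controller) sees each elephant flow once. How the reported flow is then treated differently from other flows is outside the scope of the example.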

What is claimed is:
 1. A method for detecting an elephant flow by inspecting operations of a machine that operates on a physical host computer, the method comprising: at a detector operating on the physical host computer, receiving information from an agent executing on the machine regarding a file transfer initiated by an application executing on the machine, the information comprising a size of the file to be transferred and a data flow associated with the file transfer; when the file size exceeds a threshold, specifying the data flow associated with the file transfer as an elephant flow; and reporting that the data flow is an elephant flow, wherein a managed forwarding element processes the data associated with the detected elephant flow differently from other flows not detected as elephant flows.
 2. The method of claim 1, wherein the agent executing on the machine detects an application programming interface (API) call regarding the file transfer.
 3. The method of claim 2, wherein the API call is associated with a particular data transfer protocol for transferring files between machines.
 4. The method of claim 1, wherein the information received from the agent comprises at least one of (i) the application that initiated the file transfer, (ii) user data, and (iii) whether the file transfer is inbound or outbound.
 5. The method of claim 1, wherein reporting that the data flow is an elephant flow comprises reporting the data flow to a network controller.
 6. The method of claim 1, wherein the network controller configures the managed forwarding element to process the data associated with the elephant flow differently.
 7. The method of claim 1, wherein the detector receives the information from the agent via a multiplexer module.
 8. A non-transitory machine-readable medium storing a detector program that when executed by at least one processing unit of a physical host computer detects an elephant flow by inspecting operations of a machine that operates on the physical host computer, the program comprising sets of instructions for: receiving information from an agent executing on the machine regarding a file transfer initiated by an application executing on the machine, the information comprising a size of the file to be transferred and a data flow associated with the file transfer; when the file size exceeds a threshold, specifying the data flow associated with the file transfer as an elephant flow; and reporting that the data flow is an elephant flow, wherein a managed forwarding element processes the data associated with the detected elephant flow differently from other flows not detected as elephant flows.
 9. The non-transitory machine-readable medium of claim 8, wherein the agent executing on the machine detects an application programming interface (API) call regarding the file transfer.
 10. The non-transitory machine-readable medium of claim 8, wherein the API call is associated with a particular data transfer protocol for transferring files between machines.
 11. The non-transitory machine-readable medium of claim 8, wherein the information received from the agent comprises at least one of (i) the application that initiated the file transfer, (ii) user data, and (iii) whether the file transfer is inbound or outbound.
 12. The non-transitory machine-readable medium of claim 8, wherein the set of instructions for reporting that the data flow is an elephant flow comprises a set of instructions for reporting the data flow to a network controller.
 13. A method of inspecting operations of a machine to detect an elephant flow, the method comprising: at the machine, detecting an initiation of a new data flow to transfer data; tracking an amount of data being transferred in the data flow; determining, based on the amount of data transferred, whether the data flow is an elephant flow; and if the determination is made that the data flow is an elephant flow, specifying the data flow as elephant flow and identifying one or more pieces of information associated with the elephant flow in order to process the data associated with detected elephant flow different from other non-detected flows.
 14. The method of claim 13, wherein the machine is a virtual machine (VM) or a physical machine.
 15. The method of claim 13, wherein the detecting comprises intercepting a network connection that is being opened on the machine.
 16. The method of claim 15, wherein the detecting further comprises intercepting a socket call that is made to open the network connection.
 17. The method of claim 13, wherein the determining comprises (i) comparing the amount of data transferred with a threshold value and (ii) specifying, if the amount of data being transferred is greater than a threshold value, that the data flow is an elephant flow.
 18. The method of claim 13 further comprising sending a message regarding a detected elephant flow to an agent, wherein the agent receives the message and processes each packets in the detected elephant flow different from other non-detected flows.
 19. The method of claim 18, wherein the machine is a source machine, wherein the context information includes at least one of the name of the source machine, an address associated with the source machine, an address associated with the destination machine, a port number, an identification of an application that initiated the new data flow, and user data.